Sorting streamed multisets
Abstract
Sorting is a classic problem and one to which many others reduce easily. In the streaming model, however, we are allowed only one pass over the input and sublinear memory, so in general we cannot sort. In this paper we show that, to determine the sorted order of a multiset $s$ of size $n$ containing $\sigma$ distinct elements using one pass and $o(n \log \sigma)$ bits of memory, it is generally necessary and sufficient that its entropy $H = o(\log \sigma)$. Specifically, if $s = \{s_1, \ldots, s_n\}$ and $s_{i_1}, \ldots, s_{i_n}$ is the stable sort of $s$, then we can compute $i_1, \ldots, i_n$ in one pass using $O((H+1)n)$ time, $O(\sigma)$ words plus $O((H+1)n)$ bits of memory, and a simple combination of classic techniques. On the other hand, in the worst case it takes $\Omega(Hn)$ bits of memory to compute any sorted ordering of $s$ in one pass.
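To make the quantities in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm. It assumes H is the empirical entropy of the multiset, H = sum over distinct elements a of (n_a/n) log2(n/n_a), where n_a is the number of occurrences of a. It keeps one dictionary entry per distinct element, standing in for the O(σ) words, and stores each element's positions as gaps. The function names `entropy_bits`, `gamma_len` and `stable_sort_indices` are hypothetical. The sketch only counts the bits that Elias gamma codes of the gaps would occupy; it does not bit-pack them. It also sorts the distinct keys at the end, so it makes no attempt to match the stated time bound.

```python
import math
from collections import Counter, defaultdict


def entropy_bits(s):
    """Empirical entropy H = sum_a (n_a/n) * log2(n/n_a) of the multiset s."""
    n = len(s)
    return sum(c / n * math.log2(n / c) for c in Counter(s).values())


def gamma_len(x):
    """Length in bits of the Elias gamma code of a positive integer x."""
    return 2 * x.bit_length() - 1


def stable_sort_indices(stream):
    """Read the stream once; return (i_1..i_n, bits the gap codes would use)."""
    gaps = defaultdict(list)  # distinct element -> gaps between its positions
    last = {}                 # distinct element -> last position seen (1-based)
    bits = 0
    for pos, x in enumerate(stream, start=1):
        gap = pos - last.get(x, 0)
        gaps[x].append(gap)
        bits += gamma_len(gap)  # count bits only; nothing is actually packed
        last[x] = pos
    # Emit positions element by element in key order; within one element the
    # positions come out in increasing order, so the result is stable.
    order = []
    for x in sorted(gaps):
        pos = 0
        for g in gaps[x]:
            pos += g
            order.append(pos)
    return order, bits
```

Because the gaps of an element with n_a occurrences sum to at most n, Jensen's inequality bounds the total gamma-code length by (2H+1)n bits, which is O((H+1)n). For example, calling `stable_sort_indices([3, 1, 2, 1, 3, 1])` yields the 1-based permutation that lists the positions of the 1s, then the 2, then the 3s.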
Similar resources
Efficient and Scalable Parallel Algorithm for Sorting Multisets on Multi-core Systems
By adaptively distributing the data blocks to the processing cores to balance their computation loads, and by applying the strategy of “the extremum of the extremums” to select the data with the same keys, a cache-efficient, thread-level parallel algorithm for sorting multisets on multi-core computers is proposed. For the multiset sorting problem, an aperiodic multi-round data distribution ...
An Analysis of Permutations in Arrays
This paper is concerned with the synthesis of invariants in programs with arrays. More specifically, we consider properties concerning array contents up to a permutation. For instance, to prove a sorting procedure correct, one has to show that the result is sorted, but also that it is a permutation of the initial array. In order to analyze this kind of property, we define an abstract interpretation w...
Distribution-Sensitive Algorithms
We investigate a new paradigm of algorithm design for geometric problems that can be termed distribution-sensitive. Our notion of distribution is more combinatorial in nature than spatial. We illustrate this on problems like planar-hulls and 2D-maxima where some of the previously known output-sensitive algorithms are recast in this setting. In a number of cases, the distribution-sensitive analy...
Efficient Translation of External Input in a Dynamically Typed Language
New algorithms are given to compile external data in string form into data structures for high level datatypes. Let I be a language of external constants formed from atomic constants and from set, multiset, and tuple constructors. We show how to read an input string C, decide whether it belongs to I, convert it to internal form, and build initial data structures storing the internal value of C ...
An Optimal Parallel Algorithm for Sorting Multisets
We consider the problem of sorting $n$ numbers that contain only $k$ distinct values. We present a randomized arbitrary CRCW PRAM algorithm that runs in $O(\log n)$ time using $\frac{n \log k}{\log n}$ processors. The same algorithm runs in $O\left(\frac{\log n}{\log \log n}\right)$ time with a total work of $O(n (\log k)^{1+\epsilon})$ for any fixed $\epsilon > 0$. All the stated bounds hold with high probability.
Publication date: 2007